Toucan: deciphering the cis-regulatory logic of coregulated genes.

Authors

  • Stein Aerts
  • Gert Thijs
  • Bert Coessens
  • Mik Staes
  • Yves Moreau
  • Bart De Moor
Abstract

TOUCAN is a Java application for the rapid discovery of significant cis-regulatory elements from sets of coexpressed or coregulated genes. Biologists can automatically (i) retrieve genes and intergenic regions, (ii) identify putative regulatory regions, (iii) score sequences for known transcription factor binding sites, (iv) identify candidate motifs for unknown binding sites, and (v) detect those statistically over-represented sites that are characteristic for a gene set. Genes or intergenic regions are retrieved from Ensembl or EMBL, together with orthologs and supporting information. Orthologs are aligned and syntenic regions are selected as candidate regulatory regions. Putative sites for known transcription factors are detected using our MotifScanner, which scores position weight matrices using a probabilistic model. New motifs are detected using our MotifSampler based on Gibbs sampling. Binding sites characteristic for a gene set (and thus statistically over-represented with respect to a reference sequence set) are found using a binomial test. We have validated Toucan by analyzing muscle-specific genes, liver-specific genes and E2F target genes; we have easily detected many known binding sites within intergenic DNA and identified new biologically plausible sites for known and unknown transcription factors. Software available at http://www.esat.kuleuven.ac.be/~dna/BioI/Software.html.
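The over-representation step described in the abstract can be illustrated with a minimal Python sketch. The abstract states only that a binomial test is used; the counting scheme below (number of sequences containing a site, with the background rate estimated from the reference set) and the function names `binomial_tail` and `overrepresentation_pvalue` are assumptions made for illustration, not Toucan's actual implementation.

```python
from math import comb


def binomial_tail(k, n, p):
    """Upper-tail probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def overrepresentation_pvalue(hits_in_set, set_size, hits_in_reference, reference_size):
    """One-sided binomial p-value for a site being over-represented in a gene set.

    Assumption: the background rate is the fraction of reference sequences
    that contain the site; Toucan's exact counting scheme may differ.
    """
    background_rate = hits_in_reference / reference_size
    return binomial_tail(hits_in_set, set_size, background_rate)


if __name__ == "__main__":
    # Hypothetical counts: a site found in 8 of 10 coregulated genes,
    # but in only 100 of 1000 reference sequences.
    print(overrepresentation_pvalue(8, 10, 100, 1000))
```

A small p-value suggests that the site occurs more often in the gene set than the reference rate would predict. When many candidate sites are tested at once, some correction for multiple testing would normally be applied as well.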


Related sources

A genetic algorithm for the detection of new cis-regulatory modules in sets of coregulated genes

SUMMARY The implementation of a genetic algorithm is described that provides a fast method of searching for the optimal combination of transcription factor binding sites in a set of regulatory sequences. AVAILABILITY The algorithm can be used transparently as a web service from within the Toucan software. Toucan can be accessed at http://www.esat.kuleuven.ac.be/~saerts/software/toucan.php. A ...


Computational biology: toward deciphering gene regulatory information in mammalian genomes.

Computational biology is a rapidly evolving area where methodologies from computer science, mathematics, and statistics are applied to address fundamental problems in biology. The study of gene regulatory information is a central problem in current computational biology. This article reviews recent development of statistical methods related to this field. Starting from microarray gene selection...


Computational detection of cis-regulatory modules

MOTIVATION The transcriptional regulation of a metazoan gene depends on the cooperative action of multiple transcription factors that bind to cis-regulatory modules (CRMs) located in the neighborhood of the gene. By integrating multiple signals, CRMs confer an organism specific spatial and temporal rate of transcription. RESULTS Based on the hypothesis that genes that are needed in exactly th...


TOUCAN 2: the all-inclusive open source workbench for regulatory sequence analysis

We present the second and improved release of the TOUCAN workbench for cis-regulatory sequence analysis. TOUCAN implements and integrates fast state-of-the-art methods and strategies in gene regulation bioinformatics, including algorithms for comparative genomics and for the detection of cis-regulatory modules. This second release of TOUCAN has become open source and thereby carries the potenti...


Regulatory effects of cis- and trans-LncRNAs on differential expression of genes following infection with viral hemorrhagic septicemia virus in rainbow trout (Oncorhynchus mykiss)

In this study, the cis and trans regulatory effects of long non-coding RNAs (lncRNAs) on the expression of genes in fish infected with viral hemorrhagic septicemia virus (VHS) were investigated using RNA-seq technology. At the end of the experimental period (the thirty-fifth day), total RNA was extracted from spleen tissue (group treated with virus) and physiological serum (control group) was used to pr...



Journal:
  • Nucleic acids research

Volume 31, Issue 6

Pages: -

Publication date: 2003